Rumored specs of the next-generation AMD RDNA 3 powered Radeon RX 7900 XT & NVIDIA Ada Lovelace powered GeForce RTX 4090 graphics cards have been detailed once again. The rumors come from Greymon55, who has been actively posting details about upcoming hardware such as CPUs and GPUs on his Twitter feed for a while now.
AMD RDNA 3 & NVIDIA Ada Lovelace GPU Powered Next-Gen Flagship Radeon RX 7900 XT & GeForce RTX 4090 Graphics Cards Detailed
The AMD RDNA 3 powered Navi 31 and Ada Lovelace powered AD102 GPUs are expected to bring huge performance improvements and to be the most power-hungry chips ever made. While NVIDIA is aiming for a monolithic approach with its Ada Lovelace architecture, AMD will utilize a full MCM design, an area where it has already taken the lead with the launch of the CDNA 2 powered MI200 series ‘Aldebaran’. AMD will now leverage the same MCM technology for consumer and gaming GPUs. So let’s talk about the latest rumored specs that we have received from the leaker:
AD102
5nm TSMC
144SM 18432CUDA
384bit 24G GDDR6X 21Gbps?
2.3~2.5GHz?
85~92T FP32?
Greymon55 (@greymon55), November 9, 2023
AMD Radeon RX 7900 XT Graphics Card – RDNA 3 Powered Navi 31 Flagship GPU
The AMD Navi 31 GPU, the flagship RDNA 3 chip, will power the next-gen Radeon RX 7900 XT graphics card. We have heard that AMD will drop CUs (Compute Units) in favor of WGPs (Work Group Processors) on its next-gen RDNA 3 GPUs. Since Navi 31 is an MCM GPU, it will feature two key IPs: a GCD (Graphics Core Die) based on TSMC’s 5nm process and an MCD (Multi-Cache Die) based on TSMC’s 6nm process node. Earlier rumors suggest that AMD has already taped out its Navi 31 GPU die.
The rumored Navi 31 GPU configuration features two GCDs (Graphics Core Dies) and a single MCD (Multi-Cache Die). Each GCD has 3 Shader Engines (6 in total) and each Shader Engine has 2 Shader Arrays (2 per SE / 6 per GCD / 12 in total). Each Shader Array is composed of 5 WGPs (10 per SE / 30 per GCD / 60 in total) and each WGP features 8 SIMD32 units with 32 ALUs each (40 SIMD32 per SA / 80 per SE / 240 per GCD / 480 in total). These SIMD32 units combine to make up 7,680 cores per GCD and 15,360 cores in total.
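The hierarchy above is just a chain of multiplications, so it can be checked with a few lines of Python. This is a minimal sketch: the constant and function names are made up for illustration, and the inputs are the rumored figures quoted in this article, not confirmed specifications.

```python
# Rumored Navi 31 hierarchy, taken from the figures quoted above.
GCDS = 2               # Graphics Core Dies
SE_PER_GCD = 3         # Shader Engines per GCD
SA_PER_SE = 2          # Shader Arrays per Shader Engine
WGP_PER_SA = 5         # Work Group Processors per Shader Array
SIMD32_PER_WGP = 8     # SIMD32 units per WGP
ALUS_PER_SIMD32 = 32   # ALUs ("cores") per SIMD32 unit


def navi31_totals():
    """Multiply the hierarchy out into per-GCD and whole-GPU totals."""
    wgp_per_gcd = SE_PER_GCD * SA_PER_SE * WGP_PER_SA
    simd32_per_gcd = wgp_per_gcd * SIMD32_PER_WGP
    cores_per_gcd = simd32_per_gcd * ALUS_PER_SIMD32
    return {
        "wgp_per_gcd": wgp_per_gcd,
        "wgp_total": wgp_per_gcd * GCDS,
        "simd32_per_gcd": simd32_per_gcd,
        "simd32_total": simd32_per_gcd * GCDS,
        "cores_per_gcd": cores_per_gcd,
        "cores_total": cores_per_gcd * GCDS,
    }


totals = navi31_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

The results line up with the counts in the paragraph: 30 WGPs, 240 SIMD32 units and 7,680 cores per GCD, doubling to 60, 480 and 15,360 across both dies.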
Performance-wise, the GPU is expected to feature a clock speed of 2.4-2.5 GHz, which puts its theoretical performance at around 75 TFLOPs (FP32). That is an insane 226% improvement over the Radeon RX 6900 XT graphics card.
The Navi 31 (RDNA 3) MCD will be linked to the dual GCDs via a next-generation Infinity Fabric interconnect and feature 256-512 MB of Infinity Cache. Each GPU die (GCD) should also feature 4 memory connect links (32-bit). That is a total of 8 32-bit memory controllers for a 256-bit bus interface. The card is said to feature up to 32 GB of GDDR6 memory running at 18 Gbps pin speeds, which delivers up to 576 GB/s of bandwidth. Another rumor that appeared recently suggests that AMD will be using 3D Infinity Cache technology on its RDNA 3 lineup, which will integrate the new cache in vertical stacks on the GPUs, similar to how the Vermeer-X chips will stack L3 cache over the CCD.
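For reference, the theoretical FP32 figure can be reproduced with the usual peak-throughput formula of cores × 2 FLOPs per clock × clock speed. The sketch below assumes one fused multiply-add (2 FLOPs) per core per clock. It also assumes an RX 6900 XT baseline of 5,120 cores at a 2.25 GHz boost clock, which is not stated in this article, and `fp32_tflops` is a hypothetical helper name.

```python
def fp32_tflops(cores, clock_ghz, flops_per_clock=2):
    """Peak FP32 throughput in TFLOPs (assumes one FMA = 2 FLOPs per core per clock)."""
    return cores * flops_per_clock * clock_ghz / 1000.0


NAVI31_CORES = 15360  # rumored total from the configuration above

navi31_low = fp32_tflops(NAVI31_CORES, 2.4)
navi31_high = fp32_tflops(NAVI31_CORES, 2.5)

# Assumption: RX 6900 XT baseline of 5120 cores at a 2.25 GHz boost clock.
rx6900xt = fp32_tflops(5120, 2.25)

# Percentage improvement of the article's ~75 TFLOPs figure over that baseline.
improvement_pct = (75.0 / rx6900xt - 1.0) * 100.0

print(f"Navi 31: {navi31_low:.1f} to {navi31_high:.1f} TFLOPs")
print(f"RX 6900 XT: {rx6900xt:.1f} TFLOPs")
print(f"Improvement: {improvement_pct:.0f}%")
```

Under these assumptions the range works out to roughly 74-77 TFLOPs, and the uplift lands within a point of the quoted 226%, with the small gap coming from rounding the baseline to 23 TFLOPs.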
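The bandwidth figure follows directly from the bus width and pin speed: each pin moves 18 Gbit/s, and dividing by 8 converts bits to bytes. A minimal sketch, using the rumored numbers above and a hypothetical helper name:

```python
def memory_bandwidth_gbs(bus_width_bits, pin_speed_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * pin_speed_gbps / 8


GCDS = 2
LINKS_PER_GCD = 4   # rumored memory connect links per GCD
LINK_WIDTH_BITS = 32

controllers = GCDS * LINKS_PER_GCD
bus_width = controllers * LINK_WIDTH_BITS
bandwidth = memory_bandwidth_gbs(bus_width, 18)

print(f"{controllers} controllers, {bus_width}-bit bus, {bandwidth:.0f} GB/s")
```

This gives 8 controllers, a 256-bit bus and 576 GB/s, matching the figures quoted in the rumor.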
AMD RDNA GPU (Generational Comparison) Preliminary:
GPU Name | Navi 10 | Navi 21 | Navi 31 |
---|---|---|---|
GPU Process | 7nm | 7nm | 5nm (6nm?) |
GPU Package | Monolithic | Monolithic | MCM (Multi-Chip Module) |
Shader Engines | 2 | 4 | 6 |
GPU WGPs | 20 | 40 | 30 (per GCD), 60 (in total) |
SPs Per WGP | 128 | 128 | 256 |
Compute Units (Per Die) | 40 | 80 | 120 (per GCD), 240 (in total) |
Cores (Per Die) | 2560 | 5120 | 7680 |
Cores (Total) | 2560 | 5120 | 15360 (2 x GCD) |
Memory Bus | 256-bit | 256-bit | 256-bit |
Memory Type | GDDR6 | GDDR6 | GDDR6 |
Infinity Cache | N/A | 128 MB | 256-512 MB |
Flagship SKU | Radeon RX 5700 XT | Radeon RX 6900 XT | Radeon RX 7900 XT |
TBP | 225W | 330W | 420-450W |
Launch | Q3 2019 | Q4 2020 | Q4 2023 |
NVIDIA GeForce RTX 4090 Graphics Card – Ada Lovelace Powered AD102 Flagship GPU
Based on previous rumors, there have been whispers that NVIDIA would utilize TSMC’s N5 (5nm) process node for its Ada Lovelace GPUs. This includes the AD102 SKU, which will be a fully monolithic design. In his latest tweet, which covers the specific GPU configuration, the AD102 GPU is said to feature a clock speed as high as 2.5 GHz (2.3 GHz average boost). The tweet states that the GPU clock for Ada Lovelace ‘AD102’ will be 2.3 GHz or higher, so let’s take that as a baseline and use the previously leaked specs to work out where the performance should land.
The NVIDIA AD102 “ADA GPU” appears to have 18432 CUDA cores based on the preliminary specs (which can change), housed within 144 SM units. That is almost twice the cores present in Ampere, which was already a massive step up from Turing. A 2.3-2.5 GHz clock speed would give us 85 to 92 TFLOPs of compute performance (FP32). That is more than twice the FP32 performance of the existing RTX 3090, which packs 36 TFLOPs of FP32 compute power.
The 150% performance jump looks huge, but one should remember that NVIDIA already delivered a big jump in FP32 numbers this generation with Ampere. The Ampere GA102 GPU (RTX 3090) offers 36 TFLOPs while the Turing TU102 GPU (RTX 2080 Ti) offered 13 TFLOPs. That is over a 150% increase in FP32 FLOPs, but in real-world gaming the RTX 3090 averaged only around 50-60% faster than the RTX 2080 Ti. So one thing we must not forget is that FLOPs don’t equal GPU gaming performance these days. Furthermore, we don’t know if 2.3-2.5 GHz is the average boost or the peak boost, with the former meaning that there could be even higher compute potential for AD102.
Apart from that, the leaker also states that the NVIDIA GeForce RTX 40 flagship will retain a 384-bit bus interface, similar to the RTX 3090. What’s interesting, though, is that the leaker mentions G6X, which means that NVIDIA won’t be moving to a new memory standard until after Ada Lovelace and will utilize the higher 21 Gbps pin speeds of G6X for its next-generation cards before we see a newer standard (e.g. GDDR7). The card will feature 24 GB of memory, so we can expect either single-sided 16Gb DRAM or dual-sided 8Gb DRAM modules.
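The same back-of-the-envelope math applies to AD102. The sketch below derives the CUDA core count from the SM count and then the FP32 range at the rumored clocks. It assumes 128 FP32 cores per SM (which is what 18432 / 144 implies) and 2 FLOPs per core per clock, and the names are illustrative only.

```python
SM_COUNT = 144
CORES_PER_SM = 128        # assumption implied by 18432 cores / 144 SMs
RTX_3090_TFLOPS = 36.0    # figure quoted in the article


def fp32_tflops(cores, clock_ghz):
    """Peak FP32 throughput in TFLOPs (assumes one FMA = 2 FLOPs per core per clock)."""
    return cores * 2 * clock_ghz / 1000.0


cuda_cores = SM_COUNT * CORES_PER_SM
ad102_low = fp32_tflops(cuda_cores, 2.3)
ad102_high = fp32_tflops(cuda_cores, 2.5)

print(f"CUDA cores: {cuda_cores}")
print(f"FP32: {ad102_low:.1f} to {ad102_high:.1f} TFLOPs")
print(f"vs RTX 3090: {ad102_low / RTX_3090_TFLOPS:.2f}x to {ad102_high / RTX_3090_TFLOPS:.2f}x")
```

At 2.3 GHz this comes to about 84.8 TFLOPs and at 2.5 GHz about 92.2 TFLOPs, which is where the leaker's 85~92T range comes from.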
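The two module options can be worked out from the bus width. The sketch below assumes each GDDR6X chip exposes a 32-bit interface, so a 384-bit bus has 12 chip positions on one side of the board, and that a dual-sided (clamshell) layout doubles the chip count. The function name and return format are hypothetical.

```python
def module_options(capacity_gb, bus_width_bits, densities_gbit=(8, 16), chip_bits=32):
    """List (density in Gbit, chip count, layout) combos that reach the target capacity."""
    positions = bus_width_bits // chip_bits   # chip positions on one side of the board
    capacity_gbit = capacity_gb * 8
    options = []
    for density in densities_gbit:
        chips, remainder = divmod(capacity_gbit, density)
        if remainder:
            continue
        if chips == positions:
            options.append((density, chips, "single-sided"))
        elif chips == 2 * positions:
            options.append((density, chips, "dual-sided"))
    return options


options = module_options(24, 384)
for density, chips, layout in options:
    print(f"{chips} x {density}Gb chips, {layout}")
```

For 24 GB on a 384-bit bus this yields 24 x 8Gb chips in a dual-sided layout or 12 x 16Gb chips in a single-sided layout, the two cases mentioned above.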
NVIDIA CUDA GPU (RUMORED) Preliminary:
GPU | TU102 | GA102 | AD102 |
---|---|---|---|
Architecture | Turing | Ampere | Ada Lovelace |
Process | TSMC 12nm NFF | Samsung 8nm | 5nm |
Graphics Processing Clusters (GPC) | 6 | 7 | 12 |
Texture Processing Clusters (TPC) | 36 | 42 | 72 |
Streaming Multiprocessors (SM) | 72 | 84 | 144 |
CUDA Cores | 4608 | 10752 | 18432 |
Theoretical TFLOPs | 16.1 | 37.6 | ~90 TFLOPs? |
Memory Type | GDDR6 | GDDR6X | GDDR6X |
Memory Bus | 384-bit | 384-bit | 384-bit |
Memory Capacity | 11 GB (2080 Ti) | 24 GB (3090) | 24 GB (4090?) |
Flagship SKU | RTX 2080 Ti | RTX 3090 | RTX 4090? |
TGP | 250W | 350W | 450-650W? |
Launch | Sep. 2018 | Sep. 2020 | 2023 (TBC) |
The NVIDIA Ada Lovelace GPUs will power the next-generation GeForce RTX 40 graphics cards that will go head-on with AMD’s RDNA 3 based Radeon RX 7000 series graphics cards. There is still some speculation regarding the use of MCM by NVIDIA. The Hopper GPU, which is primarily aimed at the datacenter & AI segment, is allegedly taping out soon and will feature an MCM architecture. NVIDIA won’t be using an MCM design on its Ada Lovelace GPUs, so they will keep the traditional monolithic design.