
AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores

Written by Jeff Lampkin

Rumored specs of the next-generation AMD RDNA 3 powered Radeon RX 7900 XT & NVIDIA Ada Lovelace powered GeForce RTX 4090 graphics cards have been detailed once again. The rumors come from Greymon55, who has been actively posting details regarding upcoming hardware such as CPUs and GPUs on his Twitter feed for a while now.

AMD RDNA 3 & NVIDIA Ada Lovelace GPU Powered Next-Gen Flagship Radeon RX 7900 XT & GeForce RTX 4090 Graphics Cards Detailed

The AMD RDNA 3 powered Navi 31 and Ada Lovelace powered AD102 GPUs are expected to bring huge performance improvements and are also expected to be the most power-hungry chips either company has made. While NVIDIA is aiming for a monolithic approach with its Ada Lovelace architecture, AMD will utilize a full MCM design, an area where it has already taken the lead with the launch of the CDNA 2 powered MI200 series ‘Aldebaran’. AMD will now leverage the same MCM technology for consumer and gaming GPUs. So let’s talk about the latest rumored specs that we’ve got from the leaker:


AMD Radeon RX 7900 XT Graphics Card – RDNA 3 Powered Navi 31 Flagship GPU

The AMD Navi 31 GPU, the flagship RDNA 3 chip, would power the next-gen Radeon RX 7900 XT graphics card. We have heard that AMD will drop CUs (Compute Units) in favor of WGPs (Work Group Processors) on its next-gen RDNA 3 GPUs. Since Navi 31 is an MCM GPU, it will feature two key IPs: a GCD (Graphics Core Die) based on TSMC’s 5nm process and an MCD (Multi-Cache Die) based on TSMC’s 6nm process node. Earlier rumors suggest that AMD has already taped out its Navi 31 GPU die.

The Navi 31 GPU configuration described here features two GCDs (Graphics Core Dies) and a single MCD (Multi-Cache Die). Each GCD has 3 Shader Engines (6 in total) and each Shader Engine has 2 Shader Arrays (2 per SE / 6 per GCD / 12 in total). Each Shader Array consists of 5 WGPs (10 per SE / 30 per GCD / 60 in total) and each WGP features 8 SIMD32 units with 32 ALUs each (40 SIMD32 per SA / 80 per SE / 240 per GCD / 480 in total). These SIMD32 units combine to make up 7,680 cores per GCD and 15,360 cores in total.
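To make the hierarchy easier to follow, here is a minimal Python sketch that multiplies out the rumored counts from the paragraph above. The constant and function names are ours and purely illustrative, and the figures themselves are unconfirmed leaks.

```python
# Rumored Navi 31 layout (unconfirmed leak figures taken from the article text).
GCDS = 2
SHADER_ENGINES_PER_GCD = 3
SHADER_ARRAYS_PER_SE = 2
WGPS_PER_SHADER_ARRAY = 5
SIMD32_PER_WGP = 8
ALUS_PER_SIMD32 = 32


def navi31_totals():
    """Walk down the hierarchy: GCD -> Shader Engine -> Shader Array -> WGP -> SIMD32 -> ALU."""
    shader_engines = GCDS * SHADER_ENGINES_PER_GCD
    shader_arrays = shader_engines * SHADER_ARRAYS_PER_SE
    wgps = shader_arrays * WGPS_PER_SHADER_ARRAY
    simd32_units = wgps * SIMD32_PER_WGP
    cores = simd32_units * ALUS_PER_SIMD32
    return {
        "shader_engines": shader_engines,
        "shader_arrays": shader_arrays,
        "wgps": wgps,
        "simd32_units": simd32_units,
        "cores_per_gcd": cores // GCDS,
        "cores_total": cores,
    }


totals = navi31_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

Each level simply multiplies the one above it, which is why the per-GCD and total figures in the text always differ by a factor of two.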

Performance-wise, the GPU is expected to feature a clock speed of 2.4-2.5 GHz, which puts its theoretical performance at around 75 TFLOPs (FP32). That is an insane 226% improvement over the Radeon RX 6900 XT graphics card.
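For readers who want to check the math, the sketch below shows how such a theoretical figure is usually derived: cores times two FP32 operations per clock (one fused multiply-add) times clock speed. The helper names are ours, and the RX 6900 XT baseline of 5120 cores at a 2.25 GHz boost clock is our assumption rather than something stated in the leak.

```python
def fp32_tflops(cores, clock_ghz, flops_per_core_per_clock=2):
    """Peak FP32 TFLOPs, assuming each core retires one fused multiply-add (2 FLOPs) per clock."""
    return cores * flops_per_core_per_clock * clock_ghz / 1000.0


def percent_improvement(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Rumored Navi 31: 15360 cores at 2.4-2.5 GHz.
navi31_low = fp32_tflops(15360, 2.4)
navi31_high = fp32_tflops(15360, 2.5)

# Assumption (not from the leak): RX 6900 XT with 5120 cores at a 2.25 GHz boost clock.
rx6900xt = fp32_tflops(5120, 2.25)

# Compare the article's "around 75 TFLOPs" figure against that baseline.
gain = percent_improvement(75.0, rx6900xt)

print(f"Navi 31: {navi31_low:.1f} to {navi31_high:.1f} TFLOPs")
print(f"RX 6900 XT baseline: {rx6900xt:.2f} TFLOPs")
print(f"Gain at 75 TFLOPs: {gain:.0f}%")
```

Under these assumptions the 2.4-2.5 GHz range brackets the quoted 75 TFLOPs, and the gain lands close to the 226% figure.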


The Navi 31 (RDNA 3) MCD will be linked to the dual GCDs via a next-generation Infinity Fabric interconnect and feature 256-512 MB of Infinity Cache. Each GCD should also feature 4 memory connect links (32-bit). That is a total of 8 32-bit memory controllers for a 256-bit bus interface. It is stated that the card will feature up to 32 GB of GDDR6 memory running at 18 Gbps pin speeds, which delivers up to 576 GB/s of bandwidth. Another rumor that appeared recently suggests that AMD will be using 3D Infinity Cache technology on its RDNA 3 lineup, which will integrate the new cache in vertical stacks on the GPUs, similar to how the Vermeer-X chips will stack L3 cache over the CCD.
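The bus width and bandwidth figures above follow from simple arithmetic, sketched here in Python. The function names are ours, and the inputs are the rumored values from the paragraph, so treat the result as a sanity check rather than a confirmed spec.

```python
def bus_width_bits(controllers, bits_per_controller=32):
    """Total memory bus width from the number of 32-bit memory controllers."""
    return controllers * bits_per_controller


def bandwidth_gb_per_s(bus_bits, pin_speed_gbps):
    """Peak bandwidth: every bus pin moves pin_speed_gbps gigabits per second; 8 bits per byte."""
    return bus_bits * pin_speed_gbps / 8


# Rumored: 2 GCDs x 4 memory links each = 8 controllers, GDDR6 at 18 Gbps per pin.
navi31_bus = bus_width_bits(controllers=2 * 4)
navi31_bandwidth = bandwidth_gb_per_s(navi31_bus, 18)

print(f"Bus width: {navi31_bus}-bit")
print(f"Bandwidth: {navi31_bandwidth:.0f} GB/s")
```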

AMD RDNA GPU (Generational Comparison) Preliminary:

GPU Name                | Navi 10           | Navi 21            | Navi 31
GPU Process             | 7nm               | 7nm                | 5nm (6nm?)
GPU Package             | Monolithic        | Monolithic         | MCD (Multi-Chiplet Die)
Shader Engines          | 2                 | 4                  | 6
GPU WGPs                | 20                | 40                 | 30 (Per MCD) / 60 (In Total)
SPs Per WGP             | 128               | 128                | 256
Compute Units (Per Die) | 40                | 80                 | 120 (per MCD) / 240 (in total)
Cores (Per Die)         | 2560              | 5120               | 7680
Cores (Total)           | 2560              | 5120               | 15360 (2 x MCD)
Memory Bus              | 256-bit           | 256-bit            | 256-bit
Memory Type             | GDDR6             | GDDR6              | GDDR6
Infinity Cache          | N/A               | 128 MB             | 256-512 MB
Flagship SKU            | Radeon RX 5700 XT | Radeon RX 6900 XTX | Radeon RX 7900 XT
TBP                     | 225W              | 330W               | 420-450W
Launch                  | Q3 2019           | Q4 2020            | Q4 2023

NVIDIA GeForce RTX 4090 Graphics Card – Ada Lovelace Powered AD102 Flagship GPU

Based on previous rumors, there have been whispers that NVIDIA would utilize TSMC’s N5 (5nm) process node for its Ada Lovelace GPUs. This includes the AD102 SKU too, which will be a fully monolithic design. In his latest tweet, which covers the specific GPU configurations, the leaker says the AD102 GPU will feature a clock speed as high as 2.5 GHz (2.3 GHz average boost). The specific tweet states that the GPU clock for Ada Lovelace ‘AD102’ will be 2.3 GHz or higher, so let’s take that as a baseline and use previously leaked specs to figure out where the performance should land.

The NVIDIA AD102 “ADA GPU” appears to have 18432 CUDA Cores based on the preliminary specs (which can change), housed within 144 SM units. This is roughly 1.7x the 10752 cores of the full Ampere GA102 die, which was already a massive step up from Turing. A 2.3-2.5 GHz clock speed would give us 85 to 92 TFLOPs of compute performance (FP32). This is more than twice the FP32 performance of the existing RTX 3090, which packs 36 TFLOPs of FP32 compute power.
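As a rough check on those numbers, the Python sketch below rebuilds the core count from the SM count and then applies the usual peak-FP32 formula. The 128 cores per SM value is simply 18432 divided by 144 from the leak, the two FLOPs per core per clock assumes one fused multiply-add, and the names are ours.

```python
AD102_SMS = 144
AD102_CUDA_CORES = 18432
RTX_3090_TFLOPS = 36.0  # figure quoted in the article

# Derived from the leaked totals, not an official spec.
cores_per_sm = AD102_CUDA_CORES // AD102_SMS


def fp32_tflops(cores, clock_ghz):
    """Peak FP32 TFLOPs, assuming 2 FLOPs (one fused multiply-add) per core per clock."""
    return cores * 2 * clock_ghz / 1000.0


ad102_low = fp32_tflops(AD102_SMS * cores_per_sm, 2.3)
ad102_high = fp32_tflops(AD102_SMS * cores_per_sm, 2.5)

print(f"Cores per SM: {cores_per_sm}")
print(f"AD102: {ad102_low:.1f} to {ad102_high:.1f} TFLOPs")
print(f"Versus RTX 3090: {ad102_low / RTX_3090_TFLOPS:.2f}x to {ad102_high / RTX_3090_TFLOPS:.2f}x")
```

Both ends of the clock range come out above twice the RTX 3090 figure, which matches the claim in the text.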

The 150% performance jump looks huge, but one should remember that NVIDIA already delivered a big jump in FP32 numbers this generation with Ampere. The Ampere GA102 GPU (RTX 3090) offers 36 TFLOPs while the Turing TU102 GPU (RTX 2080 Ti) offered 13 TFLOPs. That is over a 150% increase in FP32 FLOPs, but in real-world gaming the RTX 3090 averaged only around 50-60% faster than the RTX 2080 Ti. So one thing we must not forget is that FLOPs don't equal GPU gaming performance these days. Furthermore, we don't know if 2.3-2.5 GHz is the average boost or the peak boost, with the former meaning that there could be even higher compute potential for AD102.
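The gap between paper FLOPs and gaming results is easy to see when the percentages are put side by side. This short Python sketch uses only the TFLOPs figures quoted in the article, and the helper name is ours.

```python
def percent_increase(old, new):
    """Relative increase from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


# Turing -> Ampere, using the article's 13 and 36 TFLOPs figures.
turing_to_ampere = percent_increase(13.0, 36.0)

# Ampere -> rumored Ada Lovelace, using the 85 to 92 TFLOPs range.
ampere_to_ada_low = percent_increase(36.0, 85.0)
ampere_to_ada_high = percent_increase(36.0, 92.0)

print(f"Turing to Ampere FP32 gain: {turing_to_ampere:.0f}%")
print(f"Ampere to Ada FP32 gain: {ampere_to_ada_low:.0f}% to {ampere_to_ada_high:.0f}%")
print("Reported RTX 3090 gaming gain over RTX 2080 Ti: roughly 50-60%")
```

If Ada Lovelace follows the same pattern as Ampere, the gaming uplift could be far smaller than the FP32 uplift, though that remains speculation until real benchmarks appear.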

Aside from that, the leaker also states that the NVIDIA GeForce RTX 40 flagship would retain a 384-bit bus interface, similar to the RTX 3090. What’s interesting, though, is that the leaker mentions G6X, which means that NVIDIA isn't moving to a new memory standard until after Ada Lovelace and will utilize the higher 21 Gbps pin speeds of G6X for its next-generation cards before we see a newer standard (e.g. GDDR7). The card will feature 24 GB of memory, so we can expect either single-sided 16Gb DRAM or dual-sided 8Gb DRAM modules.
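Here is a quick Python sketch of why both module layouts end up at 24 GB. It assumes each memory package sits on a 32-bit channel and that a dual-sided (clamshell) board places two packages on each channel; those are our assumptions for illustration, and the names are hypothetical.

```python
BUS_WIDTH_BITS = 384
CHANNEL_WIDTH_BITS = 32  # assumption: one 32-bit channel per memory package position


def capacity_gb(chips, density_gbit):
    """Total capacity in gigabytes from chip count and per-chip density in gigabits."""
    return chips * density_gbit / 8


channels = BUS_WIDTH_BITS // CHANNEL_WIDTH_BITS

# Single-sided: one 16Gb package per channel.
single_sided_gb = capacity_gb(channels, 16)

# Dual-sided (clamshell): two 8Gb packages per channel, one on each side of the board.
dual_sided_gb = capacity_gb(channels * 2, 8)

print(f"Channels: {channels}")
print(f"Single-sided, 16Gb chips: {single_sided_gb:.0f} GB")
print(f"Dual-sided, 8Gb chips: {dual_sided_gb:.0f} GB")
```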

NVIDIA CUDA GPU (RUMORED) Preliminary:

GPU                                | TU102           | GA102        | AD102
Architecture                       | Turing          | Ampere       | Ada Lovelace
Process                            | TSMC 12nm NFF   | Samsung 8nm  | 5nm
Graphics Processing Clusters (GPC) | 6               | 7            | 12
Texture Processing Clusters (TPC)  | 36              | 42           | 72
Streaming Multiprocessors (SM)     | 72              | 84           | 144
CUDA Cores                         | 4608            | 10752        | 18432
Theoretical TFLOPs                 | 16.1            | 37.6         | ~90 TFLOPs?
Memory Type                        | GDDR6           | GDDR6X       | GDDR6X
Memory Bus                         | 384-bit         | 384-bit      | 384-bit
Memory Capacity                    | 11 GB (2080 Ti) | 24 GB (3090) | 24 GB (4090?)
Flagship SKU                       | RTX 2080 Ti     | RTX 3090     | RTX 4090?
TGP                                | 250W            | 350W         | 450-650W?
Launch                             | Sep. 2018       | Sept. 2020   | 2023 (TBC)

The NVIDIA Ada Lovelace GPUs will power the next-generation GeForce RTX 40 graphics cards that will go head-on with AMD’s RDNA 3 based Radeon RX 7000 series graphics cards. There’s still some speculation regarding the use of MCM by NVIDIA. The Hopper GPU, which is primarily aimed at the Datacenter & AI segment, is allegedly taping out soon and will feature an MCM architecture. NVIDIA will not be using an MCM design on its Ada Lovelace GPUs, so they will keep the traditional monolithic design.

Which next-generation GPUs are you looking forward to the most?
