Intel ARC graphics cards based on the Alchemist Xe-HPG GPUs are all set for launch next year and, based on the specs, we could be looking at very competitive performance numbers against AMD and NVIDIA GPUs.
Intel’s Flagship ARC Graphics Cards With Xe-HPG Alchemist GPU To Be Highly Competitive Against NVIDIA GA104 & AMD Navi 22
The first Intel ARC graphics cards will be powered by the Alchemist GPUs based on the Xe-HPG architecture. Intel has so far confirmed that the first discrete graphics cards will hit retail by Q1 2023 and will be based on the TSMC 6nm process node. Intel also detailed the specs of the Alchemist GPUs and their core building blocks, which include the Xe-Core.
Intel ARC Xe-HPG Alchemist GPU – The Building Blocks
So rounding up what we learned, the Intel Xe-HPG Alchemist GPU features a Xe-Core which is the fundamental DNA of the 1st Gen ARC lineup. The Xe-Core is a compute block composed of 16 Vector Engines (256-bit per engine) and 16 Matrix Engines (1024-bit per engine). Each Vector Engine consists of 8 ALUs so, in total, we’re looking at 128 ALUs per Xe-Core. Each Matrix Engine block is also known as an XMX block and can handle tensor operations in both FP16 and INT8 modes. The Xe-Core further features its own dedicated L1 cache.
Intel fuses 4 Xe-Cores together to form a Render Slice which consists of 4 Ray Tracing Units, 4 Sampler Units, Geometry/Rasterize/HiZ engines, and two Pixel Backend blocks with 8 units on each. These Render Slices are put together to form the main GPUs. The flagship consists of an 8 Render Slice configuration which features 32 Xe-Cores, 512 Vector Engines, and 4096 ALUs. There will be different configurations with 2, 4, and 6 Render Slices but we’re focusing on the flagship part in this report.
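The unit counts above follow from simple multiplication, which the short Python sketch below spells out. The per-block figures are the ones quoted in this article; the class and property names are hypothetical, chosen only for illustration, and the totals for the 2, 4, and 6 Render Slice parts are extrapolated from the same ratios rather than confirmed by Intel.

```python
from dataclasses import dataclass

# Per-block counts as quoted in the article.
XE_CORES_PER_SLICE = 4
VECTOR_ENGINES_PER_XE_CORE = 16
MATRIX_ENGINES_PER_XE_CORE = 16
ALUS_PER_VECTOR_ENGINE = 8
RT_UNITS_PER_SLICE = 4
PIXEL_BACKENDS_PER_SLICE = 2
UNITS_PER_PIXEL_BACKEND = 8


@dataclass(frozen=True)
class AlchemistConfig:
    """Hypothetical helper: derives unit totals from a Render Slice count."""

    render_slices: int

    @property
    def xe_cores(self) -> int:
        return self.render_slices * XE_CORES_PER_SLICE

    @property
    def vector_engines(self) -> int:
        return self.xe_cores * VECTOR_ENGINES_PER_XE_CORE

    @property
    def matrix_engines(self) -> int:
        return self.xe_cores * MATRIX_ENGINES_PER_XE_CORE

    @property
    def alus(self) -> int:
        return self.vector_engines * ALUS_PER_VECTOR_ENGINE

    @property
    def rt_units(self) -> int:
        return self.render_slices * RT_UNITS_PER_SLICE

    @property
    def pixel_backend_units(self) -> int:
        return (self.render_slices * PIXEL_BACKENDS_PER_SLICE
                * UNITS_PER_PIXEL_BACKEND)


if __name__ == "__main__":
    # Slice counts mentioned in the article; 8 is the flagship.
    for slices in (2, 4, 6, 8):
        cfg = AlchemistConfig(slices)
        print(slices, cfg.xe_cores, cfg.vector_engines, cfg.alus,
              cfg.matrix_engines, cfg.rt_units, cfg.pixel_backend_units)
```

For the 8-slice flagship this reproduces the 32 Xe-Cores, 512 Vector Engines, and 4096 ALUs stated above, along with 512 XMX blocks, 32 RT units, and 128 pixel backend units.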
Intel ARC Alchemist vs NVIDIA GA104 & AMD Navi 22 GPUs
GPU Name | Alchemist DG-512 | NVIDIA GA104 | AMD Navi 22 |
---|---|---|---|
Architecture | Xe-HPG | Ampere | RDNA 2 |
Process Node | TSMC 6nm | Samsung 8nm | TSMC 7nm |
Flagship Product | ARC (TBA) | GeForce RTX 3070 Ti | Radeon RX 6700 XT |
Raster Engine | 8 | 6 | 2 |
FP32 Cores | 32 Xe Cores | 48 SM Units | 40 Compute Units |
FP32 Units | 4096 | 6144 | 2560 |
FP32 Compute | ~16 TFLOPs | 21.7 TFLOPs | 12.4 TFLOPs |
TMUs | 256 | 192 | 160 |
ROPs | 128 | 96 | 64 |
RT Cores | 32 RT Units | 48 RT Cores (V2) | 40 RA Units |
Tensor Cores | 512 XMX Cores | 192 Tensor Cores (V3) | N/A |
Tensor Compute | ~131 TFLOPs FP16, ~262 TOPs INT8 | 87 TFLOPs FP16, 174 TOPs INT8 | 25 TFLOPs FP16, 50 TOPs INT8 |
L2 Cache | TBA | 4 MB | 3 MB |
Additional Cache | 16 MB Smart Cache? | N/A | 96 MB Infinity Cache |
Memory Bus | 256-bit | 256-bit | 192-bit |
Memory Capacity | 16 GB GDDR6 | 8 GB GDDR6X | 16 GB GDDR6 |
Launch | Q1 2023 | Q2 2023 | Q1 2023 |
Intel ARC Xe-HPG Alchemist GPU – Comparing It To NVIDIA’s GA104 & AMD’s Navi 22
A rundown of the specs and a comparison has been made by 3DCenter, which gives us an idea of the theoretical performance that Intel’s new GPU has to offer. So right off the bat, Intel’s ARC Xe-HPG Alchemist flagship will offer more TMUs and ROPs than the NVIDIA and AMD competition. The core count at 4096 is higher than AMD’s Navi 22 and Navi 21 (RX 6800) but lower than NVIDIA’s GA104. NVIDIA, however, uses a dual FP32 numbering method, so GA104’s 6144 could theoretically be counted as 3072.
Intel’s ARC Alchemist GPUs have fewer ray tracing units than the competition but we do not know exactly how their ray tracing implementation works. For example, while Navi 22 offers more RT cores than the GA106 Ampere GPUs, the hardware-level integration within NVIDIA’s RT cores is superior in all regards to AMD’s implementation. So the final performance would rely on Intel’s hardware-level integration and software-level optimization for ray tracing applications.
A major lead that Intel could have over the competition, especially NVIDIA since AMD lacks in this department, is AI assistance in supersampling technologies. Intel has already showcased an impressive demo of its XeSS technology and, based on the expected numbers, Intel GPUs could outperform NVIDIA’s Tensor Core implementation (DLSS) with its XMX architecture. Intel is also expected to feature a small but useful game cache on its GPUs, which will be equipped with higher VRAM capacities of up to 16 GB (GDDR6) across a 256-bit bus interface. This would be twice as much memory as NVIDIA’s RTX 3070 and RTX 3070 Ti, so NVIDIA may have to prepare a refresh to counter it.
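One way to arrive at the expected XMX numbers in the comparison table is sketched below in Python. It rests on assumptions that are not confirmed by Intel: that each 1024-bit Matrix Engine processes 1024 / element-width lanes per clock, that each lane counts as 2 operations (a multiply and an accumulate), and that the chip runs at the 2 GHz peak clock used for the FP32 estimate. The function name `xmx_tera_ops` is hypothetical.

```python
def xmx_tera_ops(matrix_engines: int,
                 engine_width_bits: int,
                 element_bits: int,
                 clock_ghz: float,
                 ops_per_lane: int = 2) -> float:
    """Estimate peak tensor throughput in tera-operations per second.

    Assumption (not confirmed by Intel): lanes per engine equals
    engine width divided by element width, and each lane performs
    one multiply-accumulate (counted as 2 operations) per clock.
    """
    lanes = engine_width_bits // element_bits
    ops_per_clock = matrix_engines * lanes * ops_per_lane
    # ops/clock * 1e9 clocks/s / 1e12 ops per tera-op
    return ops_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    fp16 = xmx_tera_ops(512, 1024, 16, 2.0)
    int8 = xmx_tera_ops(512, 1024, 8, 2.0)
    print(f"FP16: ~{fp16:.0f} TFLOPs, INT8: ~{int8:.0f} TOPs")
```

Under these assumptions the flagship’s 512 XMX blocks land at roughly 131 TFLOPs FP16 and 262 TOPs INT8, in line with the table’s estimates.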
Lastly, the theoretical FP32 compute performance is calculated with an expected peak clock rate of 2 GHz. This is the most likely scenario for TSMC’s 6nm process node given how well clocks scale on TSMC’s 7nm process node. Based on that, the Intel Xe-HPG Alchemist GPU could offer around 16-17 TFLOPs of compute power. That is somewhat lower than what NVIDIA’s GA104 produces, but it should be noted that not all FLOPs should be measured equally, as gaming architectures run very differently compared to datacenter chips.
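The arithmetic behind that estimate can be sketched in a few lines of Python. It uses the common convention that each ALU retires one fused multiply-add per clock, counted as 2 floating-point operations; that convention and the function name `fp32_tflops` are assumptions for illustration, not something the article states.

```python
def fp32_tflops(alus: int, clock_ghz: float,
                ops_per_alu_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPs.

    Assumes one fused multiply-add (2 FLOPs) per ALU per clock,
    the usual convention for quoting peak GPU compute.
    """
    # ALUs * FLOPs/clock * 1e9 clocks/s / 1e12 FLOPs per TFLOP
    return alus * ops_per_alu_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    # 4096 ALUs at the article's expected 2 GHz peak clock.
    print(f"{fp32_tflops(4096, 2.0):.1f} TFLOPs")
```

With 4096 ALUs at 2 GHz this works out to about 16.4 TFLOPs; a slightly higher clock would push the figure toward the upper end of the 16-17 TFLOPs range.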
Based on these early specs, we’re looking at an Intel graphics card that could end up being faster than AMD’s Radeon RX 6700 XT and NVIDIA’s RTX 3070 with ease. To push its 1st Gen graphics cards further into the consumer segment, Intel could likely offer very aggressive prices against established giants like AMD and NVIDIA. And along with a strong suite of software-level optimizations, Intel could have a win-win on its hands which will only be pushed forward with future generations of ARC GPUs.