The TeraScale microarchitecture is based on this chip. The shader units are organized in three SIMD groups of 16 processors each, for a total of 48 processors. Each processor is a 5-wide vector unit (5 FP32 ALUs), giving 240 ALUs in total, each of which can serially execute up to two instructions per cycle (a multiply and an addition). All processors in a SIMD group execute the same instruction, so up to three instruction threads can be in execution simultaneously.
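The unit counts in this paragraph follow from simple multiplication, which the short sketch below reproduces. The constant and function names are illustrative, not taken from any SDK. The peak-throughput figure is a derived theoretical upper bound that assumes every ALU issues both a multiply and an add on every cycle at the 500 MHz clock given in the specification list.

```python
# Illustrative arithmetic only; all names here are hypothetical.
SIMD_GROUPS = 3              # SIMD groups on the chip
PROCESSORS_PER_GROUP = 16    # vector processors per SIMD group
ALUS_PER_PROCESSOR = 5       # each processor is a 5-wide FP32 vector unit
OPS_PER_ALU_PER_CYCLE = 2    # one multiply plus one addition
CLOCK_HZ = 500_000_000       # 500 MHz GPU clock


def shader_totals():
    """Return the derived shader-array totals as a dictionary."""
    processors = SIMD_GROUPS * PROCESSORS_PER_GROUP
    alus = processors * ALUS_PER_PROCESSOR
    return {
        "processors": processors,
        "alus": alus,
        "alus_per_group": alus // SIMD_GROUPS,
        # Assumption: every ALU does a multiply and an add each cycle.
        "peak_ops_per_second": alus * OPS_PER_ALU_PER_CYCLE * CLOCK_HZ,
    }


if __name__ == "__main__":
    for name, value in shader_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the sketch yields 48 processors, 240 ALUs (80 per SIMD group) and a theoretical peak of 240 billion floating-point operations per second. Sustained throughput depends on the workload and will be lower.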
500 MHz parent GPU on a 90 nm or 65 nm (since 2008) TSMC process, or a 45 nm GlobalFoundries process (since 2010, with the CPU on the same die), with a total of 232 million transistors
240 floating-point vector units for shader execution, divided into three dynamically scheduled SIMD groups of 80 units each.[3]
NEC-designed eDRAM die includes additional logic (192 parallel pixel processors) for color, alpha compositing, alpha blending, Z/stencil buffering, and anti-aliasing, called "Intelligent Memory", giving developers 4-sample anti-aliasing at very little performance cost.
Procedural Synthesis Technology (XPS): During read streaming into the CPU, a custom prefetch instruction, extended data cache block touch (xDCBT), prefetches data directly into the L1 data cache of the intended core, bypassing the L2 cache to avoid thrashing it. Write streams from each core skip the L1 cache because of its no-write-allocate policy (which keeps high-bandwidth, transient, write-only data streams from thrashing the L1 cache) and go directly to the L2 cache. The system allows the GPU to read data produced by the CPU directly, without going through main memory. In this specific case of data streaming, called Xbox procedural synthesis (XPS), the CPU is effectively a data decompressor, generating geometry on the fly for consumption by the GPU 3D core.
Maximum pixel fillrate: 16 gigasamples per second using 4X multisample anti-aliasing (MSAA), or 32 gigasamples per second using Z-only operation; 4 gigapixels per second without MSAA (8 ROPs × 500 MHz)
Maximum Z sample rate: 8 gigasamples per second (2 Z samples × 8 ROPs × 500 MHz), 32 gigasamples per second using 4X anti-aliasing (2 Z samples × 8 ROPs × 4X AA × 500 MHz)[1]
Maximum anti-aliasing sample rate: 16 gigasamples per second (4 AA samples × 8 ROPs × 500 MHz)[1]
Support for bilinear, trilinear, anisotropic filtering, Alpha to Coverage, hardware Tessellation and Predicated Tiling.[7]
Cooling: Both the GPU and CPU of the console have heatsinks. The GPU's heatsink uses heatpipe technology to conduct heat from the GPU and eDRAM die to the fins of the heatsink. The heatsinks are actively cooled by a pair of 60 mm exhaust fans. The newer XCGPU chipset redesign, featured in both the Xbox 360 S and the Xbox 360 E, integrates the CPU (Xenon) and GPU (Xenos) in one package and is actively cooled by a single heatsink rather than two.
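The fill-rate and sample-rate figures in the specification list above are products of the ROP count, the samples handled per ROP per clock, and the 500 MHz clock. The following sketch recomputes them from those factors; the function and key names are illustrative only, and the results are theoretical maxima rather than measured throughput.

```python
# Illustrative arithmetic only; all names here are hypothetical.
CLOCK_HZ = 500_000_000   # 500 MHz GPU clock
ROPS = 8                 # render output units
Z_SAMPLES_PER_ROP = 2    # Z samples per ROP per clock
MSAA_SAMPLES = 4         # 4X multisample anti-aliasing
GIGA = 10**9


def fill_rates():
    """Return the theoretical peak rates, expressed in giga-units per second."""
    return {
        # 8 ROPs x 500 MHz, no MSAA
        "pixel_fill_gpixels": ROPS * CLOCK_HZ // GIGA,
        # 4 AA samples x 8 ROPs x 500 MHz
        "aa_sample_gsamples": MSAA_SAMPLES * ROPS * CLOCK_HZ // GIGA,
        # 2 Z samples x 8 ROPs x 500 MHz
        "z_sample_gsamples": Z_SAMPLES_PER_ROP * ROPS * CLOCK_HZ // GIGA,
        # 2 Z samples x 8 ROPs x 4X AA x 500 MHz
        "z_sample_4x_gsamples": (
            Z_SAMPLES_PER_ROP * ROPS * MSAA_SAMPLES * CLOCK_HZ // GIGA
        ),
    }


if __name__ == "__main__":
    for name, value in fill_rates().items():
        print(f"{name}: {value}")
```

The computed values match the listed figures: 4 gigapixels per second without MSAA, 16 gigasamples per second with 4X MSAA, and 8 or 32 gigasamples per second for Z-only operation without and with 4X anti-aliasing.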